{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 距离衡量\n",
    "对于方程$Ax + By + C = 0$点到直线距离\n",
    "$$\n",
    "d = \\frac{\\vert Ax + By + C\\vert}{\\sqrt{A^2 + B^2}}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "推广到多维，对于$WX + B = 0$\n",
    "$$\n",
    "d = \\frac{WX + B}{\\Vert W\\Vert}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 距离限制"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于分类任务，我们想要找到这么一条分界线$Y = WX + B$\n",
    "$$\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "Y_i =& WX_i + B \\le& -1 \\\\\n",
    "Y_j =& WX_j + B \\ge& 1\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "每个点到该直线/平面的距离为\n",
    "$$\n",
    "\\gamma_i = \\frac{\\vert WX_i + B \\vert}{\\Vert W \\Vert }\n",
    "$$\n",
    "最小的距离为\n",
    "$$\n",
    "\\gamma = \\min_{1,\\cdots,n}\\frac{\\vert WX_i + B \\vert}{\\Vert W \\Vert }\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于二分类任务\n",
    "$$\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "\\frac{ WX_i + B }{\\Vert W \\Vert} \\le \\gamma & Y_i \\le -1 \\\\\n",
    "\\frac{ WX_j + B }{\\Vert W \\Vert} \\ge \\gamma & Y_j \\ge -1\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "统一表达的形式，为\n",
    "$$\n",
    "\\frac{Y(WX+B)}{\\Vert W \\Vert} \\ge \\gamma\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "两边同时除以$\\gamma$\n",
    "$$\n",
    "\\frac{Y(WX + B)}{\\Vert W\\Vert \\gamma} \\ge 1\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "更新参数\n",
    "$$\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "w = \\frac{W}{\\Vert W\\Vert \\gamma} \\\\\n",
    "b = \\frac{B}{\\Vert W\\Vert \\gamma}\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "得到\n",
    "$$\n",
    "y(wx + b) \\ge 1\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 关键点\n",
    "我们得到的最小的$\\gamma$，它等价于$\\frac{1}{\\Vert W \\Vert}$,最后的目标就变成了\n",
    "$$\n",
    "\\max\\frac{1}{2}\\Vert W\\Vert ^2\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 拉格朗日乘子法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{matrix}\n",
    "&\\min f(x)\\\\ & s.t. g(x) = C\n",
    "\\\\\n",
    "\\end{matrix}\n",
    "$$\n",
    "对于这种问题，通过构造函数\n",
    "$$\n",
    "G(x) = C - g(x)\n",
    "$$\n",
    "拉格朗日函数为\n",
    "$$\n",
    "L(x) = f(x) + \\lambda G(x)\n",
    "$$\n",
    "解方程组\n",
    "$$\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "\\triangledown f(x) = 0 \\\\\n",
    "G(x) = 0\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$\n",
    "即可求的最大值"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 例子\n",
    "已知$x + y + z = 12$，求$f(x, y, z) = x^3y^2z$的最大值。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{aligned}\n",
    "&\\left\\{\n",
    "\\begin{matrix}\n",
    "f(x, y, z) =& x^3y^2z \\\\\n",
    "g(x, y, z) =& x + y + z - 12\n",
    "\\end{matrix}\n",
    "\\right. \\Rightarrow L(x, y, z, \\lambda) = f(x, y, z) + \\lambda g(x, y, z)\n",
    "\\\\ 令偏导为0 :\\\\\n",
    "&\\left\\{\n",
    "\\begin{matrix}\n",
    "\\frac{\\partial L}{\\partial x} =& 3x^2y^2z  + \\lambda&= 0 \\\\\n",
    "\\frac{\\partial L}{\\partial y} =& 2x^3yz  + \\lambda &= 0 \\\\\n",
    "\\frac{\\partial L}{\\partial z} =& x^3y^2 + \\lambda &= 0 \\\\\n",
    "\\frac{\\partial L}{\\partial \\lambda} = &x + y + z - 12 &= 0\n",
    "\\end{matrix}\n",
    "\\right. \\\\\n",
    "前三个等式两两联立，得到:\\\\\n",
    "&\\left\\{\n",
    "\\begin{matrix}\n",
    "3x^2y^2z = 2x^3yz &\\Rightarrow & x = \\frac{3}{2}y \\\\\n",
    "2x^3yz = x^3y^2 &\\Rightarrow & z = \\frac{1}{2}y \n",
    "\\end{matrix}\n",
    "\\right. \\\\\n",
    "带入最后一个,得到 \\\\\n",
    "& \\lambda (x + y + z - 12) = 3\\lambda (y - 4) = 0 \\Rightarrow y = 4 \\\\\n",
    "& \\left\\{\n",
    "\\begin{matrix}\n",
    "x = \\frac{3}{2}y = 6 \\\\\n",
    "y = 4 \\\\\n",
    "z = \\frac{1}{2}y = 2\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\max f(x, y, z) = x^3y^2z|_{x=6, y=4, z=2} = 216 \\times 16 \\times 2 = 6912\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# KKT"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{matrix}\n",
    "\\max f(w) \\\\\n",
    "s.t. h(w) = 0 \\\\\n",
    "s.t. g(w) \\le 0\n",
    "\\end{matrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在存在等式和不等式约束的情况下，拉格朗日方程为\n",
    "$$\n",
    "L(w, \\alpha, \\beta) = f(w) + \\sum_i^n \\alpha_i h_i(w) + \\sum_i^m\\beta_ig_i(w)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`KKT` 条件\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "\\triangledown L(w) &=& 0 \\\\\n",
    "h_i(w) &=& 0 \\\\\n",
    "g_i(w) &\\le & 0 \\\\\n",
    "\\alpha_i & \\gt & 0 \\\\\n",
    "\\beta_i &\\ge & 0 \\\\\n",
    "\\beta_i g_i(w) &=& 0\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 条件对偶\n",
    "具体推导方式可以参看如下博客，还未理解，表述不明\n",
    "- https://blog.csdn.net/the_lastest/article/details/78461566\n",
    "- https://zhuanlan.zhihu.com/p/36621652"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\min_{w}L(w, \\alpha, \\beta) \\le L(w, \\alpha, \\beta) \\le \\max_{\\alpha, \\beta; \\beta \\ge 0} L(w, \\alpha, \\beta)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "条件始终满足，则最小的最大，始终小于等于最大的最小\n",
    "$$\n",
    "\\max_{\\alpha, \\beta;\\beta \\ge 0} \\min_w L(w, \\alpha, \\beta) \\le \\min_w \\max_{\\alpha, \\beta; \\beta \\ge 0} L(w, \\alpha, \\beta)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "当满足`KKT`条件的时候，等号满足。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 例子\n",
    "$$\n",
    "\\begin{matrix}\n",
    "\\min_x & f(x) &=& x_1^2 + x_2^2 &\\\\\n",
    "s.t. & h(x) &=& x_1 - x_2 - 2 &=& 0\\\\\n",
    "& g(x) &= & (x_1 - 2)^2 + x_2^2 - 1 &=& 0\n",
    "\\end{matrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "L(x, \\alpha, \\beta) = (x_1^2 + x_2^2) + \\beta(x_1 - x_2 - 2) + \\alpha \\left[ (x_1 - 2)^2 + x_2^2 -1\\right]\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\left\\{\\begin{matrix}\n",
    "p^* &=& \\min_x \\max_{\\alpha,\\beta; \\beta \\ge 0} L(x, \\alpha, \\beta) \\\\\n",
    "d^* &=& \\max_{\\alpha,\\beta; \\beta \\ge 0} \\min_w L(x, \\alpha, \\beta)\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "首先不管$\\alpha, \\beta$，直接求偏导数求极值\n",
    "$$\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "\\beta + 2x_1 + \\alpha(2x_1 - 4) &=& 0 \\\\\n",
    "2x_2 - \\beta + 2\\alpha x_2 &=& 0\n",
    "\\end{matrix}\n",
    "\\right. \\Rightarrow\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "x_1 = \\frac{4\\alpha - \\beta}{2\\alpha + 2} \\\\\n",
    "x_2 = \\frac{\\beta}{2\\alpha + 2}\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "带入原来的拉格朗日式\n",
    "$$\n",
    "\\theta(\\alpha, \\beta) = \\frac{\\beta^2 + 4\\beta + 2\\alpha^2 - 6\\alpha}{2(\\alpha + 1)}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\left\\{\\begin{matrix}\n",
    "\\frac{\\partial \\theta}{\\partial \\alpha} &=& \\frac{2\\alpha^2 + 4 \\alpha - \\beta^2 - 4 \\beta - 6}{2(\\alpha + 1)^2} &=& 0 \\\\\n",
    "\\frac{\\partial \\theta}{\\partial \\beta} &=& \\frac{2\\beta + 4}{2(\\alpha + 1)} &=& 0\n",
    "\\end{matrix}\\right. \\Rightarrow\n",
    "\\left\\{\n",
    "\\begin{matrix}\n",
    "\\alpha = \\sqrt 2 - 1 \\\\\n",
    "\\beta = -2\n",
    "\\end{matrix}\n",
    "\\right.\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
